Second-Order Guarantees of Stochastic Gradient Descent in Nonconvex Optimization
Authors
Abstract
Recent years have seen increased interest in performance guarantees of gradient descent algorithms for nonconvex optimization. A number of works have uncovered that gradient noise plays a critical role in the ability of gradient descent recursions to efficiently escape saddle points and reach second-order stationary points. Most available works limit the gradient noise component to be bounded with probability one or to be sub-Gaussian, and leverage concentration inequalities to arrive at high-probability results. We present an alternate approach, relying primarily on mean-square arguments, and show that a more relaxed relative bound on the gradient noise variance is sufficient to ensure efficient escape from saddle points without the need to inject additional noise, employ alternating step sizes, or rely on a global dispersive noise assumption, as long as a gradient noise component is present in a descent direction for every saddle point.
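As a rough, self-contained illustration of the phenomenon described in the abstract (not the paper's actual analysis or experiments), the sketch below runs plain constant-step SGD on a toy strict saddle, where the only randomness is gradient noise obeying a relative variance bound with purely illustrative constants; the iterate still escapes the saddle without injected noise or alternating step sizes.

```python
# Minimal sketch (not the paper's method or experiments): plain SGD on the
# strict saddle f(w) = w[0]**2 - w[1]**2, whose only randomness is gradient
# noise satisfying a relative bound on its variance.
import numpy as np

rng = np.random.default_rng(0)

def grad(w):
    # True gradient of f(w) = w[0]^2 - w[1]^2 (saddle point at the origin).
    return np.array([2.0 * w[0], -2.0 * w[1]])

def stochastic_grad(w, rel=0.5, absolute=1e-3):
    # Noise whose second moment obeys roughly E||s||^2 <= rel^2*||grad||^2 + absolute^2,
    # i.e. a relative variance bound; the constants are illustrative only.
    g = grad(w)
    scale = rel * np.linalg.norm(g) + absolute
    return g + scale * rng.standard_normal(2)

w = np.zeros(2)          # start exactly at the saddle point
mu = 0.05                # constant step size, no annealing or alternation
for t in range(200):
    w = w - mu * stochastic_grad(w)

print(w, w[0]**2 - w[1]**2)   # |w[1]| has grown, so f(w) < f(0): the saddle is escaped
```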
Similar Articles
Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization
Asynchronous parallel implementations of stochastic gradient (SG) have been broadly used in training deep neural networks and have seen many successes in practice recently. However, existing theories cannot explain their convergence and speedup properties, mainly due to the nonconvexity of most deep learning formulations and the asynchronous parallel mechanism. To fill the gaps in theory and provi...
Stochastic Recursive Gradient Algorithm for Nonconvex Optimization
In this paper, we study and analyze the mini-batch version of StochAstic Recursive grAdient algoritHm (SARAH), a method employing the stochastic recursive gradient, for solving empirical loss minimization for the case of nonconvex losses. We provide a sublinear convergence rate (to stationary points) for general nonconvex functions and a linear convergence rate for gradient dominated functions,...
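As a hedged sketch of the recursive gradient step that SARAH is built on, the snippet below runs one outer epoch on toy least-squares data; the data, batch size, and step size are illustrative and not the cited paper's setup.

```python
# Sketch of one outer epoch of mini-batch SARAH with the recursive gradient
# v_t = grad_B(w_t) - grad_B(w_{t-1}) + v_{t-1}; toy data only.
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_batch(w, idx):
    # Mini-batch gradient of the average loss f_i(w) = 0.5*(a_i.w - b_i)^2.
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ w - bi) / len(idx)

eta, batch, inner_steps = 0.05, 10, 50
w_prev = np.zeros(d)
v = grad_batch(w_prev, np.arange(n))   # v_0: full gradient at the snapshot
w = w_prev - eta * v                   # w_1 = w_0 - eta * v_0
for t in range(1, inner_steps):
    idx = rng.choice(n, size=batch, replace=False)
    # Recursive gradient update evaluated on the same mini-batch at w_t and w_{t-1}.
    v = grad_batch(w, idx) - grad_batch(w_prev, idx) + v
    w_prev, w = w, w - eta * v

print(np.linalg.norm(grad_batch(w, np.arange(n))))  # stationarity measure shrinks
```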
Linear Convergence of Accelerated Stochastic Gradient Descent for Nonconvex Nonsmooth Optimization
In this paper, we study the stochastic gradient descent (SGD) method for nonconvex nonsmooth optimization, and propose an accelerated SGD method by combining the variance reduction technique with Nesterov's extrapolation technique. Moreover, based on the local error bound condition, we establish the linear convergence of our method to obtain a stationary point of the nonconvex optimization....
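One generic way such a combination can look (an SVRG-style variance-reduced gradient evaluated at a Nesterov-extrapolated point) is sketched below on toy smooth data; the cited paper's exact update, step sizes, and nonsmooth handling may differ.

```python
# Rough, self-contained sketch: variance reduction + Nesterov extrapolation.
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad(w, idx):
    # Mini-batch gradient of f(w) = (1/2n) * ||A w - b||^2.
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ w - bi) / len(idx)

eta, beta = 0.05, 0.9            # step size and extrapolation weight (illustrative)
x = x_prev = np.zeros(d)
for epoch in range(5):
    snap = x.copy()
    full_g = grad(snap, np.arange(n))            # anchor (full) gradient at the snapshot
    for t in range(50):
        y = x + beta * (x - x_prev)              # Nesterov extrapolation step
        idx = rng.choice(n, size=10, replace=False)
        v = grad(y, idx) - grad(snap, idx) + full_g   # variance-reduced gradient
        x_prev, x = x, y - eta * v

print(np.linalg.norm(grad(x, np.arange(n))))     # stationarity measure
```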
On the convergence properties of a K-step averaging stochastic gradient descent algorithm for nonconvex optimization
Despite their popularity, the practical performance of asynchronous stochastic gradient descent methods (ASGD) for solving large scale machine learning problems is not as good as theoretical results indicate. We adopt and analyze a synchronous K-step averaging stochastic gradient descent algorithm which we call K-AVG. We establish the convergence results of K-AVG for nonconvex objectives, and s...
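A minimal sketch of the K-step averaging pattern (P workers each take K local SGD steps on their own mini-batches, then the models are averaged) is shown below; the data shards, model, and constants are illustrative rather than the cited paper's setting.

```python
# Toy sketch of synchronous K-step averaging SGD (the K-AVG idea).
import numpy as np

rng = np.random.default_rng(3)
n, d, P, K = 400, 5, 4, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
shards = np.array_split(rng.permutation(n), P)   # one data shard per worker

def grad(w, idx):
    # Mini-batch gradient of f(w) = (1/2m) * ||A_idx w - b_idx||^2.
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ w - bi) / len(idx)

eta = 0.05
w = np.zeros(d)                                  # shared model
for rnd in range(20):                            # communication rounds
    local_models = []
    for p in range(P):
        wp = w.copy()
        for k in range(K):                       # K local SGD steps per worker
            idx = rng.choice(shards[p], size=8, replace=False)
            wp -= eta * grad(wp, idx)
        local_models.append(wp)
    w = np.mean(local_models, axis=0)            # average after every K steps

print(np.linalg.norm(grad(w, np.arange(n))))
```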
On Nonconvex Decentralized Gradient Descent
Consensus optimization has received considerable attention in recent years. A number of decentralized algorithms have been proposed for convex consensus optimization. However, on consensus optimization with nonconvex objective functions, our understanding of the behavior of these algorithms is limited. When we lose convexity, we cannot hope for obtaining globally optimal solutions (though we st...
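For context, a toy sketch of the basic decentralized gradient descent (DGD) update over a ring network is given below; the network, mixing weights, and local objectives are illustrative and not taken from the cited paper.

```python
# Toy sketch of decentralized gradient descent (DGD): each agent mixes with
# its ring neighbors via a doubly stochastic matrix, then takes a local step.
import numpy as np

rng = np.random.default_rng(4)
N, d = 5, 3                                      # number of agents and dimension
targets = rng.standard_normal((N, d))            # agent i minimizes 0.5*||x - targets[i]||^2

# Metropolis-style mixing matrix for a ring graph (symmetric, doubly stochastic).
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 1 / 3
    W[i, (i - 1) % N] = 1 / 3
    W[i, (i + 1) % N] = 1 / 3

X = np.zeros((N, d))                             # row i = agent i's iterate
alpha = 0.1
for t in range(200):
    grads = X - targets                          # local gradients, stacked row-wise
    X = W @ X - alpha * grads                    # consensus mixing + gradient step

print(X)                        # the rows (agents) end up close to each other...
print(targets.mean(axis=0))     # ...and near the average minimizer; a constant
                                # step size leaves a small O(alpha) consensus gap
```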
Journal
Journal title: IEEE Transactions on Automatic Control
Year: 2022
ISSN: 0018-9286, 1558-2523, 2334-3303
DOI: https://doi.org/10.1109/tac.2021.3131963